Practical 1: A control chart problem
part a - summary of waiting times
The graph displays the daily waiting times over the period covered by the data. It is difficult to discern any seasonality or cyclic pattern in the extreme waiting times.
As shown in the boxplot and the table, Monday, Tuesday and Wednesday are the days of the week that present the highest extreme values. We can also notice that even though Friday has the highest average waiting time, its maximum value is the lowest.
Moreover, the boxplot shows outliers on the first four days of the week; these are precisely the values we analyse when working with extremes.
| Weekday | Min | Q1 | Avg | Q3 | Max |
|---|---|---|---|---|---|
| Monday | 116 | 297 | 446 | 559 | 999 |
| Tuesday | 90 | 262 | 431 | 550 | 1154 |
| Wednesday | 95 | 293 | 471 | 597 | 1182 |
| Thursday | 81 | 217 | 373 | 487 | 938 |
| Friday | 148 | 342 | 485 | 622 | 918 |
part b - normal approximation
As the one-year period corresponds to \[n = 250\] open days, the upper control limit corresponds to the quantile at level \[p = 1 - 1/n = 0.996.\] Evaluating the quantile function of the fitted normal distribution at this level yields the upper control limit.

| x |
|---|
| 987 |

The upper control limit corresponds to 987.472 seconds.
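The computation above can be sketched as follows. This is an illustrative Python version (the report's R code is not shown); the sample mean and standard deviation below are placeholder values, since the fitted moments are not reported.

```python
from scipy.stats import norm

n = 250            # open days in one year
p = 1 - 1 / n      # 0.996, the control-limit level

# Hypothetical sample moments: the actual mean and sd of the daily
# waiting times are not reproduced in the report.
mean_wait, sd_wait = 441.0, 206.0

z = norm.ppf(p)                  # standard-normal quantile at level 0.996
ucl = mean_wait + z * sd_wait    # upper control limit under normality
```

With the true sample moments, `ucl` would reproduce the reported 987.472.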
The graph suggests that the normal distribution is not appropriate for predicting extreme values: the extreme observations diverge strongly from their theoretical values under the assumption of normality. This suggests that the tails are heavy.
Running the Shapiro-Wilk test confirms this: the hypothesis of normality is rejected, with a p-value of \(6.105 \times 10^{-10}\).
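A Shapiro-Wilk test of this kind can be sketched as below. The sample here is a simulated heavy-tailed stand-in, since the real waiting-time series is not reproduced in the report.

```python
import numpy as np
from scipy.stats import shapiro

# Stand-in data: a right-skewed, heavy-tailed sample of 250 "daily waits"
# (purely illustrative; not the report's data).
rng = np.random.default_rng(1)
waits = rng.lognormal(mean=6.0, sigma=0.5, size=250)

# H0: the sample comes from a normal distribution.
stat, pvalue = shapiro(waits)
reject = pvalue < 0.05
```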
part c - analysing long waiting times
Block Maxima
The Block Maxima approach divides the dataset into blocks containing the same number of observations; for each block, we then select the observation with the highest value. Looking at the dataset, we can divide it into monthly or weekly blocks:
Comparing the aggregation per month and per week, the former seems better at first sight. However, in order to confirm this hypothesis, we are going to model both cases and keep the model that provides the best results. For each model, we fit a GEV distribution to the maxima of the blocks, represented by the red dots in the following plots.
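Extracting block maxima can be sketched as follows, here in Python with a toy series (the report's data has 5 open days per week, so weekly blocks hold 5 observations):

```python
import numpy as np

# Toy daily series: 50 "weeks" of 5 open days each (250 values).
daily = np.arange(250.0)

# Weekly block maxima: reshape into blocks of 5 days, take each block's max.
weekly_maxima = daily.reshape(50, 5).max(axis=1)

# Monthly blocks have unequal lengths, so in practice one would group by
# calendar month (e.g. a pandas groupby) instead of reshaping.
```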
Peaks-over-threshold approach
The Peaks-over-threshold approach defines the extreme values as the observations above a certain threshold u. In order to find this threshold, we can use a mean residual life plot (mrlplot) and a threshold choice plot (tcplot).
Looking at the plot, we believe that the best threshold is at a value of 800. However, we are going to see if this holds true by analyzing the results in case we use a threshold of 600.
We are going to fit a GPD to the differences between the observations highlighted by the red dots and the threshold: first using a threshold of 800, then of 600.
part d - modelling the extremes
Block Maxima
To see what kind of distribution to use, we have to estimate the location \(\mu\), scale \(\sigma\), and shape \(\xi\) parameters for the maxima. First, we will fit a model on the monthly maxima, then on the weekly ones.
Monthly:
#>
#> Call: fgev(x = maxima_monthly$Average.Wait.Seconds)
#> Deviance: 284
#>
#> Estimates
#> loc scale shape
#> 831.743 144.635 -0.202
#>
#> Standard Errors
#> loc scale shape
#> 38.518 30.520 0.273
#>
#> Optimization Information
#> Convergence: successful
#> Function Evaluations: 119
#> Gradient Evaluations: 87
The estimates of the GEV show that the fitted distribution is of the Weibull type: indeed, the shape parameter is negative (-0.202).
Looking at the plots, we remark that while the probability plot is not well fitted, the observations almost perfectly follow the straight line in the qq-plot. Moreover, the Return Level Plot shows a concave curve: this is a consequence of the negative shape parameter.
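A GEV fit of this kind can be sketched with scipy as below. The data are a simulated stand-in (the raw maxima are not included in the report), and note that scipy's shape parameter `c` equals minus the usual GEV shape \(\xi\), so the report's \(\xi = -0.202\) corresponds to a positive `c`.

```python
import numpy as np
from scipy.stats import genextreme

# Simulated stand-in for block maxima, roughly on the scale of the report's
# monthly estimates (loc ~ 830, scale ~ 145, xi ~ -0.2, i.e. scipy c ~ 0.2).
rng = np.random.default_rng(0)
maxima = genextreme.rvs(c=0.2, loc=830.0, scale=145.0, size=2000,
                        random_state=rng)

# Maximum-likelihood fit; returns (c, loc, scale) in scipy's convention.
c_hat, loc_hat, scale_hat = genextreme.fit(maxima)
```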
We are now going to make the same analysis but this time using the weekly maxima.
Weekly:
#>
#> Call: fgev(x = maxima_weekly$Average.Wait.Seconds)
#> Deviance: 1236
#>
#> Estimates
#> loc scale shape
#> 585.613 170.495 -0.126
#>
#> Standard Errors
#> loc scale shape
#> 19.8480 14.0297 0.0748
#>
#> Optimization Information
#> Convergence: successful
#> Function Evaluations: 66
#> Gradient Evaluations: 38
Once again, the shape parameter is negative, meaning that the fitted GEV is again of the Weibull type.
Comparing the results between monthly and weekly blocks, we can clearly see that the results are better with the latter. Indeed, the points are almost perfectly fitted in both the probability and quantile plots, which was not the case when we fitted the GEV to the monthly maxima. Thus, we can conclude that for the block maxima approach, the maxima from weekly blocks should be used.
POT
We are now going to fit a model using the Peaks-over-threshold approach. The function we use for this model is fpot, which fits a GPD to the differences between the observations and the threshold. As we saw previously, two thresholds are likely to be interesting for the modelling: the first one is a threshold u1 of 800, and the second is at 600. As we did with the block maxima, we are going to analyse both and select the best model.
The plots above are the results of the analysis using a threshold of 600. The diagnostic plots look good, meaning that the model fits our data well. Moreover, we can observe that once again the Return Level Plot shows a concave curve, suggesting a negative shape parameter for the model.
We are now going to do the same analysis using a threshold of 800.
Naturally, with a higher threshold the number of exceedances is lower. Looking at the results, we conclude that we should select the threshold of 600, since it gives better pp- and qq-plots.
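The POT fit can be sketched as follows, again in Python on simulated stand-in data (the report's series is not reproduced). Fixing the GPD location at 0 mirrors what fpot does with observation minus threshold.

```python
import numpy as np
from scipy.stats import genpareto

# Stand-in positive-valued series, purely illustrative.
rng = np.random.default_rng(2)
x = rng.gamma(shape=4.0, scale=110.0, size=5000)

u = 600.0                # candidate threshold
excess = x[x > u] - u    # exceedances over the threshold

# Fit a GPD to the exceedances with location fixed at 0;
# returns (shape, loc, scale) in scipy's convention.
xi_hat, _, beta_hat = genpareto.fit(excess, floc=0)
```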
part e - upper control limit
Since it is difficult to distinguish which model is best between the Block Maxima (weekly) and the Peaks-over-threshold (with u = 600), we are going to compute the one-year return level for both approaches.
Block Maxima
#> [1] 1111
Using the Block Maxima approach, 1111.202 is the average waiting time that is expected to be exceeded once every 250 open days (one year).
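This return level can be recovered directly from the fitted weekly GEV parameters reported in part d. With 250 open days and 5 days per week, one year corresponds to 50 weekly blocks, so the one-year return level is the 0.98-quantile of the weekly-maximum distribution (again in Python; scipy's shape `c` is minus the report's \(\xi\)):

```python
from scipy.stats import genextreme

# Weekly GEV estimates from part d: xi = -0.126, so scipy's c = 0.126.
loc, scale, c = 585.613, 170.495, 0.126

weeks_per_year = 50          # 250 open days / 5 open days per week
p = 1 - 1 / weeks_per_year   # 0.98

return_level = genextreme.ppf(p, c, loc=loc, scale=scale)
```

This reproduces the reported value up to rounding of the parameter estimates.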
#> # A tibble: 2 x 2
#> Date Average.Wait.Seconds
#> <date> <int>
#> 1 2017-10-03 1154
#> 2 2018-10-03 1182
Using the historical data, we notice that this threshold has been exceeded twice in two years.
POT
Using the POT approach, we conclude that 1128.807 is the average waiting time that is expected to be exceeded once every 250 open days (one year), or 0.4% of the time.
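The POT return level follows the standard GPD formula \(x_m = u + \frac{\sigma}{\xi}\left[(m\,\zeta_u)^{\xi} - 1\right]\), where \(\zeta_u\) is the proportion of observations above the threshold and \(m\) the return period. The fitted GPD parameters for u = 600 are not reproduced in the report, so the values below are purely illustrative placeholders:

```python
# Sketch of the one-year POT return level. sigma, xi and zeta_u are
# hypothetical: the report does not list the fitted values for u = 600.
u = 600.0
sigma, xi = 180.0, -0.1   # hypothetical GPD scale and shape
zeta_u = 0.12             # hypothetical proportion of days above u
m = 250                   # return period in open days

x_m = u + (sigma / xi) * ((m * zeta_u) ** xi - 1)
```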
To conclude, we prefer to adopt a conservative approach and retain the confidence line with the lowest value, that is 1111.202, obtained with the Block Maxima approach.
Practical 2: Demand monitoring, part I
part a - out of stock products
The red line shows when the observed number of sales drops to zero: these are the stockouts. There are also other instances where sales come close to zero, but we ignore them for simplicity's sake.
From the graph above, we can also notice that right before going out of stock, the sales are extremely high. Looking at the first block, from day 0 to day 330 (the first stockout), the sales data are distributed as follows:
One can clearly notice from the first few observations shown that there is a positive trend in the sales, which the vendors were not able to forecast, hence the stockout at Day 330.
We can now look at what happened between the first and the second stockout. Following the first stockout there is no particular structure in the sales data; the sales look random. Moreover, the stockout lasted 2 days. Hence, the second stockout is probably due to mismanagement following the initial positive trend in the data.
We can now observe the sales data between the second and third stockouts:
Looking at the sales data distribution in the above graph, one can notice that the sales demand until Day 600 seems roughly flat, but that after this day there is once again a positive trend.
Going over each block of data is not necessary as the observations above are already conclusive: the vendors go out of stock when there is an increasing demand. The sellers were probably not able to forecast the positive trends and to manage the inventory given the increasing sales, leading them to stockout.
part b - newsvendor model
Newsvendor model.
Knowing the newsvendor model, we can easily compute the critical fractile p:
#> [1] 0.909
Alternatively, we could have used the SCperf library and its Newsboy function.
#> Q SS ExpC ExpP CV CR FR z
#> 142.04 41.24 50.03 857.21 0.31 0.91 0.99 1.34
Hence, the optimal order quantity will be computed as \[q = F^{-1}(0.91).\]
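The critical fractile computation can be sketched as below. The price and cost are hypothetical values, chosen only so that the ratio matches the reported fractile of roughly 0.91; the report's actual cost parameters are not shown here.

```python
# Newsvendor critical fractile: p = underage / (underage + overage).
# Illustrative figures only (no salvage value assumed).
price, cost = 11.0, 1.0
underage = price - cost    # profit lost per unit of unmet demand
overage = cost             # loss per unsold unit
p = underage / (underage + overage)
```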
part c - critical fractile and Value at Risk
Value-at-Risk (VaR), also known as a chance constraint, is used in the newsvendor model as a constraint limiting the probability of a particular event (here, a stockout) happening. In fact, we can say that the critical fractile of the newsvendor model is equivalent to the level of the Value-at-Risk (0.91): we do not want to go beyond this value, to avoid entering the extreme values.
Expected shortfall, sometimes called conditional value-at-risk, addresses the question "how bad can things get?". More concretely, one could compute the Value-at-Risk at a level of 0.91, and then compute the expected shortfall, which gives the average value of sales once we enter the extremes of the distribution, i.e. when sales are extremely high, which, as we saw at the beginning of this analysis, is likely to lead to a stockout.
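The relation between the two measures can be sketched on a toy demand sample (illustrative only; not the report's data):

```python
import numpy as np

# Toy demand sample: 1, 2, ..., 100.
demand = np.arange(1.0, 101.0)

var_91 = np.quantile(demand, 0.91)        # VaR at the critical-fractile level
es_91 = demand[demand > var_91].mean()    # expected shortfall beyond the VaR
```

By construction the expected shortfall is always at least as large as the VaR at the same level.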
part d - model using the poisson distribution
#> function(price,cost,unit_salvage){
#> salvage <- unit_salvage * cost
#> underage <- price - cost
#> overage <- cost - salvage
#> fractile <- underage / (overage + underage)
#> return(fractile)
#> }
Using the proposed function, we can fit the Poisson model to our data and compute the estimated \(\mu\) by maximum likelihood.
After fitting a Poisson distribution to our data, we obtain \(\mu = 100.8\). This allows us to compute the quantile associated with the critical fractile: according to this model, the estimated order quantity is 114.
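This step can be sketched as follows: under a Poisson demand model, the order quantity is the smallest integer whose CDF reaches the critical fractile.

```python
from scipy.stats import poisson

mu = 100.8   # MLE of the Poisson mean (the sample mean of the sales)
p = 0.909    # critical fractile from part b

# Optimal order quantity: smallest q with P(D <= q) >= p.
q = int(poisson.ppf(p, mu))
```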
part e - peaks-over-thresholds model
We are now going to compute a POT model. First we have to define the threshold.
Thresholds of 150 and 170 seem interesting, so we test both.
Comparing the pp-plots of the two models, we decide to use the threshold of 150, as it provides better results. Using the model with this threshold, we obtain the following Value at Risk:
#> [1] 145
part f - binomial backtesting
First, we create the test set that we are going to use and set up the backtest.
Poisson model
#>
#> Exact binomial test
#>
#> data: sales_violations and nrow(sales_test)
#> number of successes = 89, number of trials = 300, p-value
#> <2e-16
#> alternative hypothesis: true probability of success is not equal to 0.091
#> 95 percent confidence interval:
#> 0.25 0.35
#> sample estimates:
#> probability of success
#> 0.3
Looking at the binomial test, we see that the p-value is extremely low. Thus, we reject the null hypothesis: the observed number of violations does not match the expected number of violations.
POT model
#>
#> Exact binomial test
#>
#> data: sales_violations and nrow(sales_test)
#> number of successes = 15, number of trials = 300, p-value =
#> 0.01
#> alternative hypothesis: true probability of success is not equal to 0.091
#> 95 percent confidence interval:
#> 0.028 0.081
#> sample estimates:
#> probability of success
#> 0.05
At a level of 5%, we again reject the null hypothesis that the observed number of violations equals the expected one. However, we draw a different conclusion at a level of 1%: looking at the p-value, the null hypothesis cannot be rejected at that level.
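Both tests can be reproduced from the reported violation counts with an exact binomial test (in Python here; the report uses R's binom.test):

```python
from scipy.stats import binomtest

# Expected violation rate 0.091 over 300 test days.
poisson_bt = binomtest(k=89, n=300, p=0.091)   # Poisson model: 89 violations
pot_bt = binomtest(k=15, n=300, p=0.091)       # POT model: 15 violations
```

The Poisson model's p-value is vanishingly small, while the POT model's sits between the 1% and 5% levels, matching the conclusions above.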
Practical 3: Demand monitoring, part II
part a - Analysing the risk in a multi-product context
The main challenge under this configuration is dealing with the correlation between the products of the basket. As the VaR is not additive, the VaR figures of the individual components do not sum to the risk of the overall basket. Because a simple addition does not take correlations into account, it could misstate the risk, underestimating it in particular when the tail dependence is high.
part b - Tail Dependence
Figure 3.1
The tail dependence can be measured using the \(\bar{\chi}\) metric, given by the following equation: \[\bar{\chi}(u) = \frac{2\log \Pr[U > u]}{\log \Pr[U > u, V > u]} - 1\]
Figure 3.1 displays this measure of dependence for each pair of products at the 90% quantile. We can easily notice that certain pairs of products are highly tail dependent.
| Product pair | \(\bar{\chi}\) |
|---|---|
| 1-5 | 0.362 |
| 2-3 | 0.355 |
| 5-6 | 0.337 |
| 1-6 | 0.330 |
| 3-4 | 0.304 |
Table 3.1 shows the top five pairs of products in terms of tail dependence.
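The \(\bar{\chi}\) formula above can be checked on its two limiting cases. Under independence, \(\Pr[U>u, V>u] = (1-u)^2\), giving \(\bar{\chi} = 0\); under perfect dependence, \(\Pr[U>u, V>u] = 1-u\), giving \(\bar{\chi} = 1\). A minimal sketch:

```python
import math

def chibar(p_u, p_joint):
    """chi-bar(u) = 2 * log P[U>u] / log P[U>u, V>u] - 1."""
    return 2 * math.log(p_u) / math.log(p_joint) - 1

# Independence at u = 0.9: P[U>u] = 0.1, P[U>u, V>u] = 0.1**2 = 0.01.
indep = chibar(0.1, 0.01)
# Perfect dependence at u = 0.9: P[U>u, V>u] = 0.1.
comon = chibar(0.1, 0.1)
```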
Figure 3.2
We may also apply a tail dependence test to each pair of products, the null hypothesis being tail dependence. Figure 3.2 shows that a majority of product pairs may be considered tail dependent, since their p-values are above 5% for a probability threshold of 90%.
Figure 3.3
The Chi Bar plot may also be useful to detect tail dependence. As shown in Figure 3.3, products 2 & 3 look strongly dependent in the upper tail.
Figure 3.4
Conversely, Figure 3.4 shows low tail-dependence for product 2 & 10.
part c - Gaussian Copula
The Gaussian copula is asymptotically independent in both the upper and lower tails. Since a majority of pairs are dependent in the upper tail, the Gaussian copula seems inappropriate, as it would lead to an underestimation of the VaR.
part d - VaR of a basket of product
i. Uniform transformation
Prior to fitting the copula, we apply a uniform transformation to the data. A GPD is fitted to the data points above a threshold corresponding to the 90% quantile of each product. The instances below this quantile are treated differently: they are uniformised using their empirical cumulative distribution.
ii. Fitting Copulas
| Copula | Parameter 1 | Parameter 2 | Log-Likelihood | AIC |
|---|---|---|---|---|
| T | 0.070 | 9.51 | 256.1 | -508 |
| Gumbel | 1.074 | NA | 170.1 | -338 |
| Gauss | 0.084 | NA | 128.8 | -256 |
| Clayton | 0.056 | NA | 87.8 | -174 |
Table 3.2 displays the copulas that have been fitted to the transformed data. With an AIC of -508, the t copula fits the data best and will be retained for the remainder of our analysis.
iii. Data generation using the fitted copula
After selecting the best copula, we generate 1913 instances for each product, to which we apply the inverse of the transformation made in point i.
iv. Estimated VaR for S
The Value at Risk for the simulated data is obtained by taking the 95% quantile of the row sums and corresponds to 111.332.
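The final step can be sketched as follows, on a simulated stand-in matrix (the actual copula-generated sales are not reproduced here; the dimensions match the report's 1913 instances and, by assumption, 10 products):

```python
import numpy as np

# Stand-in for the 1913 simulated sales vectors from the fitted copula;
# the gamma marginals below are purely illustrative.
rng = np.random.default_rng(3)
simulated = rng.gamma(shape=2.0, scale=5.0, size=(1913, 10))

basket = simulated.sum(axis=1)        # total basket sales S per scenario
var_95 = np.quantile(basket, 0.95)    # 95% VaR of the basket
```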